[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test.
#22126
# Synch updated weights back to the particular worker.
with self._timers[SYNCH_WORKER_WEIGHTS_TIMER]:
    weights = local_worker.get_weights(local_worker.get_policies_to_train())
Nice
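For readers unfamiliar with the pattern in the snippet above, here is a minimal sketch of one asynchronous step: gradients from a single remote worker are applied on the local worker, and the updated weights are then sent back to that worker only. `ToyWorker` and `async_step` are hypothetical stand-ins written in plain Python; they are not RLlib classes and involve no Ray calls.

```python
# Toy stand-ins for a local and a remote rollout worker (hypothetical,
# plain Python; not RLlib's RolloutWorker API).
class ToyWorker:
    def __init__(self, weights):
        self.weights = dict(weights)

    def compute_gradients(self):
        # Pretend gradient: a fixed fraction of each weight.
        return {k: 0.1 * v for k, v in self.weights.items()}

    def apply_gradients(self, grads, lr=1.0):
        for k, g in grads.items():
            self.weights[k] -= lr * g

    def get_weights(self):
        return dict(self.weights)

    def set_weights(self, weights):
        self.weights = dict(weights)


def async_step(local_worker, remote_worker):
    """Apply one remote worker's gradients, then sync weights back to it only."""
    grads = remote_worker.compute_gradients()
    local_worker.apply_gradients(grads)
    # Only the worker that delivered gradients receives the new weights.
    remote_worker.set_weights(local_worker.get_weights())


local = ToyWorker({"w": 1.0})
remotes = [ToyWorker({"w": 1.0}) for _ in range(2)]
async_step(local, remotes[0])
```

In this sketch the second remote worker keeps its stale weights until it returns gradients of its own, which is the point of syncing back to "the particular worker" rather than broadcasting.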
rllib/agents/a3c/a3c.py
Outdated
if global_vars:
    local_worker.set_global_vars(global_vars)

# TODO: If we have processed more than one gradients
So to be clear, we haven't written to result in this PR, right? But we want to, for logging purposes.
This is still WIP. I need to add proper compilation of the results dict. The only thing that's missing is to combine those learner stats from all workers that have returned something from the `async_parallel_requests` call further above. This shim implementation right now only returns the last one.
Let me finish this before merging, of course.
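As an illustration of what "combining learner stats from all workers" could look like, here is a small sketch that averages numeric stats per policy across several per-worker results. The helper name `merge_learner_stats` and the `{policy_id: {stat_name: value}}` layout are assumptions made for this example, not the code that ended up in the PR.

```python
from collections import defaultdict
from numbers import Number


def merge_learner_stats(per_worker_stats):
    """Average numeric stats across workers, per policy ID.

    `per_worker_stats` is a list of dicts shaped like
    {policy_id: {stat_name: value}}; this layout is an assumption.
    Non-numeric values are skipped.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for stats in per_worker_stats:
        for policy_id, policy_stats in stats.items():
            for name, value in policy_stats.items():
                if isinstance(value, Number):
                    collected[policy_id][name].append(value)
    return {
        policy_id: {name: sum(vals) / len(vals) for name, vals in by_name.items()}
        for policy_id, by_name in collected.items()
    }


# Two workers returned stats in the same iteration.
results = [
    {"default_policy": {"policy_loss": 1.0, "vf_loss": 4.0}},
    {"default_policy": {"policy_loss": 3.0, "vf_loss": 2.0}},
]
merged = merge_learner_stats(results)
```

Averaging is only one possible reduction; keeping the last value (as the shim does) or weighting by batch size are alternatives.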
Merge pending results dict.
…ad of `execution_plan`) and re-instate Pong learning test. (ray-project#22126)
…on instead of `execution_plan`) and re-instate Pong learning test." (ray-project#22250) Reverts ray-project#22126 Breaks rllib:tests/test_io
This PR:
- Adds a new `training_iteration` function for A3C (alternative to existing `execution_plan`).
- […] (`_disable_execution_plan_api=True`).
- […] `tuned_examples/a3c/pong-a3c.yaml` (16 worker, LSTM+CNN Atari problem).
- […] (`tuned_examples/a3c/pong-a3c.yaml`).

Why are these changes needed?
Related issue number
Checks
I've run `scripts/format.sh` to lint the changes in this PR.